Supplementary material for CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence

Neural Information Processing Systems

The dataset evaluates the ability of LLMs to understand and analyze various aspects of open-source CTI. It consists of 5 TSV files, each corresponding to a different task; individual tasks assess, for example, the LLMs' ability to evaluate severity. Each file has a "Prompt" column used to pose questions to the LLM, and most files also include a "GT" (ground-truth) column. The dataset includes URLs indicating the sources from which the data was collected, and a permanent DOI identifier is associated with it (AI4Sec, 2024).
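
A minimal sketch of consuming one of the TSV files described above. Only the TSV format and the "Prompt" and "GT" column names come from the description; the file name, the model callable, and exact-match scoring are illustrative assumptions.

```python
import csv

def load_task(path: str):
    """Yield (prompt, ground_truth) pairs from a CTIBench-style TSV file."""
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            yield row["Prompt"], row.get("GT")  # some files may lack a GT column

def exact_match_accuracy(path: str, ask_llm) -> float:
    """Score a model (a callable prompt -> answer) by exact match against GT."""
    hits = total = 0
    for prompt, gt in load_task(path):
        if gt is None:
            continue
        total += 1
        hits += int(ask_llm(prompt).strip() == gt.strip())
    return hits / total if total else 0.0

# Usage (the file name is hypothetical):
# print(exact_match_accuracy("cti_task1.tsv", my_model))
```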


Backdoor defense, learnability and obfuscation

Christiano, Paul, Hilton, Jacob, Lecomte, Victor, Xu, Mark

arXiv.org Artificial Intelligence

We introduce a formal notion of defendability against backdoors using a game between an attacker and a defender. In this game, the attacker modifies a function to behave differently on a particular input known as the "trigger", while behaving the same almost everywhere else. The defender then attempts to detect the trigger at evaluation time. If the defender succeeds with high enough probability, then the function class is said to be defendable. The key constraint on the attacker that makes defense possible is that the attacker's strategy must work for a randomly-chosen trigger. Our definition is simple and does not explicitly mention learning, yet we demonstrate that it is closely connected to learnability. In the computationally unbounded setting, we use a voting algorithm of Hanneke et al. (2022) to show that defendability is essentially determined by the VC dimension of the function class, in much the same way as PAC learnability. In the computationally bounded setting, we use a similar argument to show that efficient PAC learnability implies efficient defendability, but not conversely. On the other hand, we use indistinguishability obfuscation to show that the class of polynomial size circuits is not efficiently defendable. Finally, we present polynomial size decision trees as a natural example for which defense is strictly easier than learning. Thus, we identify efficient defendability as a notable intermediate concept in between efficient learnability and obfuscation.
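
A toy rendition of the attacker/defender game sketched above, to make the protocol concrete. The function class, attacker, and defender used here are illustrative stand-ins, not constructions from the paper.

```python
import random

N = 16  # inputs are integers in [0, N)

def sample_function():
    """A 'clean' function from a simple class: threshold functions on [0, N)."""
    t = random.randrange(N)
    return lambda x: int(x >= t)

def attacker(f, trigger):
    """Return a backdoored copy of f that flips its output only on the trigger."""
    return lambda x: 1 - f(x) if x == trigger else f(x)

def defender(f_backdoored, x):
    """Placeholder defender: flag x as the trigger if it disagrees with its
    neighbours (a crude stand-in for the voting idea, not the paper's algorithm)."""
    neighbours = [(x - 1) % N, (x + 1) % N]
    votes = sum(f_backdoored(v) for v in neighbours)
    majority = int(votes >= 1)
    return f_backdoored(x) != majority

# One round of the game with a randomly chosen trigger.
f = sample_function()
trigger = random.randrange(N)          # the key constraint: the trigger is random
g = attacker(f, trigger)
print("defender flags trigger:", defender(g, trigger))
```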


Handling Ontology Gaps in Semantic Parsing

Bacciu, Andrea, Damonte, Marco, Basaldella, Marco, Monti, Emilio

arXiv.org Artificial Intelligence

The majority of Neural Semantic Parsing (NSP) models are developed with the assumption that there are no concepts outside the ones such models can represent with their target symbols (closed-world assumption). This assumption leads such models to generate hallucinated outputs rather than admit their lack of knowledge. Hallucinations can lead to wrong or potentially offensive responses to users. Hence, a mechanism to prevent this behavior is crucial to building trusted NSP-based Question Answering agents. To that end, we propose the Hallucination Simulation Framework (HSF), a general setting for stimulating and analyzing NSP model hallucinations. The framework can be applied to any NSP task with a closed ontology. Using the proposed framework and KQA Pro as the benchmark dataset, we assess state-of-the-art techniques for hallucination detection. We then present a novel hallucination detection strategy that exploits the computational graph of the NSP model to detect NSP hallucinations in the presence of ontology gaps and out-of-domain utterances, and to recognize NSP errors, improving the F1-Score by ~21%, ~24%, and ~1%, respectively. This is the first work in closed-ontology NSP that addresses the problem of recognizing ontology gaps. We release our code and checkpoints at https://github.com/amazon-science/handling-ontology-gaps-in-semantic-parsing.
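
The abstract leaves the graph-based detector unspecified, so the sketch below is only a minimal confidence-threshold baseline for flagging possible hallucinations in a closed-ontology parser; it is not the strategy proposed in the paper.

```python
def flag_hallucination(token_logprobs, threshold=-1.0):
    """Flag a parse whose mean token log-probability falls below a threshold,
    on the assumption that out-of-ontology requests yield less confident parses."""
    mean_lp = sum(token_logprobs) / len(token_logprobs)
    return mean_lp < threshold

# A parse decoded with per-token log-probabilities from the NSP model:
print(flag_hallucination([-0.2, -0.1, -3.5, -2.8]))  # True: low average confidence
```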


Decoding the AI Pen: Techniques and Challenges in Detecting AI-Generated Text

Abdali, Sara, Anarfi, Richard, Barberan, CJ, He, Jia

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have revolutionized the field of Natural Language Generation (NLG) by demonstrating an impressive ability to generate human-like text. However, their widespread usage introduces challenges that necessitate thoughtful examination, ethical scrutiny, and responsible practices. In this study, we delve into these challenges and explore existing strategies for mitigating them, with a particular emphasis on identifying AI-generated text as the ultimate solution. Additionally, we assess the feasibility of detection from a theoretical perspective and propose novel research directions to address the current limitations in this domain.


Poser: Unmasking Alignment Faking LLMs by Manipulating Their Internals

Clymer, Joshua, Juang, Caden, Field, Severin

arXiv.org Artificial Intelligence

Like criminals under investigation, Large Language Models (LLMs) might pretend to be aligned while being evaluated and misbehave when they have a good opportunity. Can current interpretability methods catch these 'alignment fakers'? To answer this question, we introduce a benchmark that consists of 324 pairs of LLMs fine-tuned to select actions in role-play scenarios. One model in each pair is consistently benign (aligned). The other model misbehaves in scenarios where it is unlikely to be caught (alignment faking). The task is to identify the alignment-faking model using only inputs where the two models behave identically. We test five detection strategies, one of which identifies 98% of alignment fakers.
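
A schematic of the kind of internals-based test the benchmark calls for: perturb hidden activations of both models on inputs where they act identically and flag the model whose preference for the bad action shifts more. This is an illustrative probe under assumed HuggingFace-style APIs, not a reproduction of any of the paper's five strategies.

```python
import torch

def shift_under_noise(model, inputs, layer, bad_action_id, scale=4.0, trials=8):
    """Average increase in the 'misbehave' logit when Gaussian noise is added to
    the output of a chosen module (assumed to return a plain tensor)."""
    def add_noise(module, module_in, module_out):
        return module_out + scale * torch.randn_like(module_out)

    with torch.no_grad():
        base = model(**inputs).logits[0, -1, bad_action_id]
        handle = layer.register_forward_hook(add_noise)
        try:
            noisy = torch.stack([model(**inputs).logits[0, -1, bad_action_id]
                                 for _ in range(trials)]).mean()
        finally:
            handle.remove()
    return (noisy - base).item()

# Flag whichever of the two paired models drifts further toward the bad action.
```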


Australian universities to return to 'pen and paper' exams after students caught using AI to write essays

#artificialintelligence

Australian universities have been forced to change the way they run exams and other assessments amid fears students are using emerging artificial intelligence software to write essays. Major institutions have added new rules which state that the use of AI is cheating, with some students already caught using the software. But one AI expert has warned universities are in an "arms race" they can never win. ChatGPT, which generates text on any subject in response to a prompt or query, was launched in November by OpenAI and has already been banned across all devices in New York's public schools due to concerns over its "negative impact on student learning" and potential for plagiarism. In London, one academic tested it against a 2022 exam question and said the AI's answer was "coherent, comprehensive and sticks to the points, something students often fail to do", adding he would have to "set a different kind of exam" or deprive students of internet access for future exams. In Australia, academics have cited concerns over ChatGPT and similar technology's ability to evade anti-plagiarism software while providing quick and credible academic writing.


A Simple Unified Framework for Anomaly Detection in Deep Reinforcement Learning

Zhang, Hongming, Sun, Ke, Xu, Bo, Kong, Linglong, Müller, Martin

arXiv.org Artificial Intelligence

Abnormal states in deep reinforcement learning (RL) are states that are beyond the scope of an RL policy. Such states may make the RL system unsafe and impede its deployment in real scenarios. In this paper, we propose a simple yet effective anomaly detection framework for deep RL algorithms that simultaneously considers random, adversarial and out-of-distribution (OOD) state outliers. In particular, we attain the class-conditional distributions for each action class under the Gaussian assumption, and rely on these distributions to discriminate between inliers and outliers based on Mahalanobis Distance (MD) and Robust Mahalanobis Distance. We conduct extensive experiments on Atari games that verify the effectiveness of our detection strategies. To the best of our knowledge, we present the first in-detail study of statistical and adversarial anomaly detection in deep RL algorithms. This simple unified anomaly detection paves the way towards deploying safe RL systems in real-world applications.
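
A compact sketch of the Mahalanobis-distance detector described above: fit a class-conditional Gaussian over state features for each action class, then score a new state by its distance to the nearest class mean. Feature extraction from the policy network is assumed and left abstract; the tied covariance is a common design choice for MD detectors, not necessarily the paper's.

```python
import numpy as np

def fit_class_gaussians(features, actions, eps=1e-6):
    """Per-action means plus a shared (tied) covariance over centered features."""
    classes = np.unique(actions)
    means = {a: features[actions == a].mean(axis=0) for a in classes}
    centered = np.vstack([features[actions == a] - means[a] for a in classes])
    cov = np.cov(centered, rowvar=False) + eps * np.eye(features.shape[1])
    return means, np.linalg.inv(cov)

def mahalanobis_score(x, means, cov_inv):
    """Smaller score = more inlier-like; threshold on this to flag OOD/adversarial states."""
    dists = [float((x - m) @ cov_inv @ (x - m)) for m in means.values()]
    return min(dists)

# Usage: fit on features of in-distribution states, then flag states whose score
# exceeds a threshold chosen on held-out data.
rng = np.random.default_rng(0)
feats = rng.normal(size=(500, 8))
acts = rng.integers(0, 4, size=500)
means, cov_inv = fit_class_gaussians(feats, acts)
print(mahalanobis_score(rng.normal(size=8) + 5.0, means, cov_inv))  # large -> outlier
```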


An Innovative Attack Modelling and Attack Detection Approach for a Waiting Time-based Adaptive Traffic Signal Controller

Dasgupta, Sagar, Hollis, Courtland, Rahman, Mizanur, Atkison, Travis

arXiv.org Artificial Intelligence

The evolution of mainstream transportation systems towards a connected cyber infrastructure, such as connected traffic signal controllers, is increasing system vulnerability to cyber attacks, allowing malicious actors (individuals, criminals, or terrorist organizations) to exploit security vulnerabilities of such transportation infrastructure (1)-(3). In the U.S., the number of cyberattacks on smart mobility systems has jumped significantly in recent years (4). As vehicles move towards connected and automated driving, and cities focus on creating a transportation cyber infrastructure that transforms legacy infrastructure into connected, adaptable, and automated systems, these security problems will only grow and further compromise public safety (5). Many studies show that a cyber attack on connected vehicle-based (CV-based) traffic signal control algorithms can break down a traffic network by creating severe congestion (6-10). An adaptive traffic signal controller (ATSC) combined with a connected vehicle (CV) concept uses real-time vehicle trajectory data to regulate green time; this combination can also significantly reduce intersection waiting time and improve travel time in a signalized corridor (11). A CV-based ATSC can be manipulated in two ways: (i) gaining access through vulnerabilities and exploiting the ATSC directly; and (ii) producing abnormal behavior by manipulating the vehicle-related data it receives (9).
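
The excerpt notes that a CV-based ATSC regulates green time from real-time vehicle trajectory data. The toy allocator below, which splits a fixed cycle in proportion to accumulated waiting time per approach, is a hypothetical illustration of that idea, not the controller studied in the paper.

```python
def allocate_green(waiting_times, cycle_s=90, min_green_s=10):
    """waiting_times: dict approach -> total vehicle waiting time (s) from CV data."""
    total = sum(waiting_times.values()) or 1.0
    spare = cycle_s - min_green_s * len(waiting_times)
    return {a: min_green_s + spare * w / total for a, w in waiting_times.items()}

print(allocate_green({"NB": 120.0, "SB": 40.0, "EB": 200.0, "WB": 40.0}))
```

Note how spoofed trajectory data that inflates one approach's waiting time would directly skew this split, which is the second manipulation path the excerpt describes.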


Stereotypical Bias Removal for Hate Speech Detection Task using Knowledge-based Generalizations

Badjatiya, Pinkesh, Gupta, Manish, Varma, Vasudeva

arXiv.org Artificial Intelligence

With the ever-increasing cases of hate spread on social media platforms, it is critical to design abuse detection mechanisms to proactively avoid and control such incidents. While there exist methods for hate speech detection, they stereotype words and hence suffer from inherently biased training. Bias removal has been traditionally studied for structured datasets, but we aim at bias mitigation from unstructured text data. In this paper, we make two important contributions. First, we systematically design methods to quantify the bias for any model and propose algorithms for identifying the set of words which the model stereotypes. Second, we propose novel methods leveraging knowledge-based generalizations for bias-free learning. Knowledge-based generalizations provide an effective way to encode knowledge because the abstraction they provide not only generalizes content but also facilitates retraction of information from the hate speech detection classifier, thereby reducing the imbalance. We experiment with multiple knowledge generalization policies and analyze their effect on general performance and on mitigating bias. Our experiments with two real-world datasets, a Wikipedia Talk Pages dataset (WikiDetox) of size ~96k and a Twitter dataset of size ~24k, show that the use of knowledge-based generalizations results in better performance by forcing the classifier to learn from generalized content. Our methods utilize existing knowledge bases and can easily be extended to other tasks.
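
A minimal sketch of the generalization step described above: replace words the model is found to stereotype with a more general concept before (re)training the classifier. The stereotyped-word set and the word-to-generalization mapping below are hypothetical placeholders; the paper derives them from bias measurements and existing knowledge bases.

```python
# Hypothetical mapping from stereotyped words to knowledge-base generalizations.
GENERALIZATIONS = {
    "muslim": "religious_group",
    "immigrant": "social_group",
    "female": "person",
}

def generalize(tokens, stereotyped, mapping=GENERALIZATIONS):
    """Swap stereotyped tokens for their knowledge-based generalization."""
    return [mapping.get(t, t) if t in stereotyped else t for t in tokens]

print(generalize("ban every immigrant now".split(), stereotyped={"immigrant"}))
# ['ban', 'every', 'social_group', 'now']
```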